Purpose:
Examine perinatal periods of risk and examine the spatial variation in feto-infant morality in Fulton and Dekalb counties in Georgia.
Objectives:
1. Use a spatial data class in R, ppp, which is necessary for executing the kernel estimation functions
2. Use kernel density estimation of spatial point processes, including selection of fixed bandwidths, and use of adaptive bandwidths
3. Estimate spatial relative risk surfaces, including estimation of tolerance contours
4. Estimate a statistically-optimal fixed bandwidth and explore adaptive bandwidths for use with the gwss() function
5. Calculate local, spatially-weighted mean, median, SD, and IQR for four census-tract level continuous measures using kernel density functions
6. Using Monte Carlo simulation to produces significance contours on our estimates of local, spatially- weighted summary statistics
7. Calculate local, spatially-weighted bivariate statistics summarizing how the correlations (Pearson and Spearman) of pairs of variables varies through space
Data Preparation:
library(tidyverse)
library(sp)
library(sf)
library(tmap)
library(spatstat)
library(sparr)
library(maptools)
library(raster)
library(GWmodel)
# This is points for births in Dekalb/Fulton county
b_point <- st_read('birth_points.gpkg')
## Reading layer `birth_points' from data source `/Users/brookelappe/Desktop/EPI 563/birth_points.gpkg' using driver `GPKG'
## Simple feature collection with 94373 features and 0 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 1029894 ymin: 1222298 xmax: 1098234 ymax: 1301535
## projected CRS: NAD83 / Conus Albers
# This is points for deaths in Dekalb/Fulton county
d_point <- st_read('death_points.gpkg')
## Reading layer `death_points' from data source `/Users/brookelappe/Desktop/EPI 563/death_points.gpkg' using driver `GPKG'
## Simple feature collection with 701 features and 0 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 1040557 ymin: 1224819 xmax: 1095361 ymax: 1294009
## projected CRS: NAD83 / Conus Albers
# This is an outline of Dekalb/Fulton county to be used as a study 'window'
county <- st_read('DekalbFultonWindow.gpkg') %>%
as('Spatial')
## Reading layer `DekalbFultonWindow' from data source `/Users/brookelappe/Desktop/EPI 563/DekalbFultonWindow.gpkg' using driver `GPKG'
## Simple feature collection with 1 feature and 2 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: 1026366 ymin: 1220671 xmax: 1098719 ymax: 1302019
## projected CRS: NAD83 / Conus Albers
# This is Dekalb/Fulton census tracts
atl <- st_read('Fulton-Dekalb-covariates.gpkg') %>%
as('Spatial') # convert to sp class
## Reading layer `Fulton-Dekalb-covariates' from data source `/Users/brookelappe/Desktop/EPI 563/Fulton-Dekalb-covariates.gpkg' using driver `GPKG'
## Simple feature collection with 345 features and 6 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 1026366 ymin: -386115.2 xmax: 1098719 ymax: -304767
## projected CRS: NAD83 / Conus Albers
Briefly examine the data sets:
summary(b_point)
## geom
## POINT :94373
## epsg:5070 : 0
## +proj=aea ...: 0
summary(d_point)
## geom
## POINT :701
## epsg:5070 : 0
## +proj=aea ...: 0
Introducing kernel density estimation (KDE).
KDE is a non-parametric tool that can estimate the varying spatial intensity of a spatial point process (spp) and estimate a spatial relative risk surface representing spatial variation in event occurance conditional on the population at risk.
Create the planar point process (ppp) data object using list of x, y coordinates for event points and definition for spatial window. Dekalb and Fulton counties are the customized spatial window for fetal births and deaths.
# create the owin class object and examine it with the code below
county_owin <-
maptools::as.owin.SpatialPolygons(county)
summary(county_owin)
## Window: polygonal boundary
## single connected closed polygon with 698 vertices
## enclosing rectangle: [1026366.3, 1098719.2] x [1220671, 1302019.3] units
## (72350 x 81350 units)
## Window area = 2086320000 square units
## Fraction of frame area: 0.354
plot(county_owin)

# create the birth ppp object
b_ppp <- ppp(x = st_coordinates(b_point)[, 1],
y = st_coordinates(b_point)[, 2],
window = county_owin)
# create the death ppp object
d_ppp <- ppp(x = st_coordinates(d_point)[, 1],
y = st_coordinates(d_point)[, 2],
window = county_owin)
# summarize and plot
summary(d_ppp)
## Planar point pattern: 701 points
## Average intensity 3.359978e-07 points per square unit
##
## Coordinates are given to 2 decimal places
## i.e. rounded to the nearest multiple of 0.01 units
##
## Window: polygonal boundary
## single connected closed polygon with 698 vertices
## enclosing rectangle: [1026366.3, 1098719.2] x [1220671, 1302019.3] units
## (72350 x 81350 units)
## Window area = 2086320000 square units
## Fraction of frame area: 0.354
plot(d_ppp)

summary(b_ppp)
## Planar point pattern: 94373 points
## Average intensity 4.523412e-05 points per square unit
##
## Coordinates are given to 2 decimal places
## i.e. rounded to the nearest multiple of 0.01 units
##
## Window: polygonal boundary
## single connected closed polygon with 698 vertices
## enclosing rectangle: [1026366.3, 1098719.2] x [1220671, 1302019.3] units
## (72350 x 81350 units)
## Window area = 2086320000 square units
## Fraction of frame area: 0.354
plot(b_ppp)

The summary includes information about the overall spatial intensity, number of points, and observational window.
Bandwith Selection
Fixed bandwith: A single value if h designates that the width of the kernel (and thus the resulting smoothness of the estimated intensity surface) is the same for the entire study region. Fixed bandwidths are commonly used, but choosing a single value can be challenging in practice when the density of points varies substantially across the study region.
Adaptive bandwith: As the name implies, this approach changes the size of the kernel density bandwidth according to the density of points (data) in differing sub-areas of the overall study region. The result is is relatively more smoothing (larger bandwidth) in areas with sparse point data, and relatively less smoothing (smaller bandwidth) in areas with more point density.
Oversmoothing to identify mazimum amount of smoothing neccesary for minimizing statistical error. Use funtion OS().
# oversmoothing algorithm with function OS()
h_os_d <- OS(d_ppp)
h_os_b <- OS(b_ppp)
Fixed bandwith KDE
Use the oversmooth estimate of the upper bound for bandwith for each point process.
#fixed bandwith KDE with bivariate.density()
death_kde <- bivariate.density(d_ppp, h0 = h_os_d, edge = 'diggle')
birth_kde <- bivariate.density(b_ppp, h0 = h_os_b, edge = 'diggle')
summary(birth_kde)
## Bivariate Kernel Density/Intensity Estimate
##
## Bandwidth
## Fixed smoothing with h0 = 1897.371 units (to 4 d.p.)
##
## No. of observations
## 94373
##
## Spatial bound
## Type: polygonal
## 2D enclosure: [1026366, 1098719] x [1220671, 1302019]
##
## Evaluation
## 128 x 128 rectangular grid
## 5808 grid cells out of 16384 fall inside study region
## Density/intensity range [6.313906e-14, 1.585466e-09]
names(birth_kde)
## [1] "z" "h0" "hp" "h" "him" "q"
## [7] "gamma" "geometric" "pp"
#quickly visualize resulting fixed bandwith KDE plot
par(mfrow = c(1, 2))
plot(birth_kde, main = 'Birth density')
plot(death_kde, main = 'Death density')

par(mfrow = c(1,1))
Adaptive bandwith KDE
Implementation of algorithem that adapts the bandwith across study region in response to the density or sparseness of the data. Requires specification of global banthwith, and adaptation makes global smaller or larger as needed.
#adaptive bandwith KDE with bivariate.density()
death_kde_adapt <- bivariate.density(d_ppp,
h0 = h_os_d,
edge = 'diggle',
adapt = TRUE,
verbose = FALSE)
birth_kde_adapt <- bivariate.density(b_ppp,
h0 = h_os_d,
edge = 'diggle',
adapt = TRUE,
verbose = FALSE)
##quickly visualize resulting adaptive bandwith KDE plot
par(mfrow = c(1, 2))
plot(birth_kde_adapt, main = 'Birth density\n(adaptive h)')
plot(death_kde_adapt, main = 'Death density\n(adaptive h)')

par(mfrow = c(1,1))
Plotting KD estimates with tmap()
#convert image data to raster data
death_kde_raster <- raster(death_kde_adapt$z,
crs = "+init=epsg:5070")
birth_kde_raster <- raster(birth_kde_adapt$z,
crs = "+init=epsg:5070")
#create map of death surface
m1 <- tm_shape(death_kde_raster) + tm_raster(palette = 'BuPu',
style = 'cont',
title = 'Death density') +
tm_layout(legend.format = list(scientific = T))
#create map of brith surface
m2 <- tm_shape(birth_kde_raster) +
tm_raster(palette = 'BuPu',
style = 'cont',
title = 'Birth density') +
tm_layout(legend.format = list(scientific = T))
tmap_arrange(m1,m2)

Creating relative risk surface manually
To manually create spatial relative risk surface, take ratio of KDE intesity surface which will give essentially an SMR (relative deviation of each area from overall average value). Values below 1 have lower than average risk, and values above 1 have higher average risk.
#create risk surface as ratio of death density to birth density
risk <- death_kde_raster / birth_kde_raster
#map SMR
tm_shape(risk) +
tm_raster(palette = '-RdYlGn',
style = 'cont',
breaks = c(0.1, 0.6, 0.9, 1.1, 2, 4.9),
title = 'IMR SMR')

Creating relative risk surface with risk() function
Sparr package provides shortcut for estimation of spatial relative risk surfaces in function risk(). Also illustrates another feature which allows for quantifying statistical precision by creating tolerance contours. Tolerance contours are simply lines which encircle regions that are statistically significant below a given threshold.
By default, the function estimates the log relative risk, which is a helpful reminder that the relative risk is asymmetric. However, we understand ratio measures, and will be careful to plot the results appropriately. For that reason, I set log = FALSE, although obviously you could omit that and keep everything on the log scale.
imr1000 <- risk(d_ppp, b_ppp, h0 = 1000,
tolerate = T,
verbose = F,
log = F,
edge = 'diggle')
imr2000 <- risk(d_ppp, b_ppp, h0 = 2000,
tolerate = T,
log = F,
edge = 'diggle',
verbose = F)
imr4000 <- risk(d_ppp, b_ppp, h0 = 4000,
tolerate = T,
log = F,
edge = 'diggle',
verbose = F)
imr8000 <- risk(d_ppp, b_ppp, h0 = 8000,
tolerate = T,
log = F,
edge = 'diggle',
verbose = F)
imradapt <- risk(d_ppp, b_ppp, h0 = h_os_d,
adapt = T,
tolerate = T,
log = F,
edge = 'diggle',
verbose = F)
summary(imr1000)
## Log-Relative Risk Function.
##
## Estimated risk range [-1.967191e-13, 11.05946]
##
## --Numerator (case) density--
## Bivariate Kernel Density/Intensity Estimate
##
## Bandwidth
## Fixed smoothing with h0 = 1000 units (to 4 d.p.)
##
## No. of observations
## 701
##
## Spatial bound
## Type: polygonal
## 2D enclosure: [1026366, 1098719] x [1220671, 1302019]
##
## Evaluation
## 128 x 128 rectangular grid
## 5808 grid cells out of 16384 fall inside study region
## Density/intensity range [-2.491416e-25, 2.78467e-09]
##
## --Denominator (control) density--
## Bivariate Kernel Density/Intensity Estimate
##
## Bandwidth
## Fixed smoothing with h0 = 1000 units (to 4 d.p.)
##
## No. of observations
## 94373
##
## Spatial bound
## Type: polygonal
## 2D enclosure: [1026366, 1098719] x [1220671, 1302019]
##
## Evaluation
## 128 x 128 rectangular grid
## 5808 grid cells out of 16384 fall inside study region
## Density/intensity range [2.143127e-16, 2.383589e-09]
names(imr1000)
## [1] "rr" "f" "g" "P"
#visualize
par(mar = c(1,1,1,1))
par(mfrow = c(3,2))
plot(imr1000)
plot(imr2000)
plot(imr4000)
plot(imr8000)
plot(imradapt)
par(mfrow = c(1,1))

Using functions to map RR with tmap()
How to write simple custom functions in R (set of instructions that accepts inputs and carries out action on those inputs to return to some output) to spead up repetitive tasks.
This is a function that accepts a single argument, labeled simply x here. The expectation is that x should be the output of the above risk() function. Notice how the function first extracts the spatial relative risk surface (e.g. x$rr), and then assigns the appropriate projection (it got stripped off some where along the way). Then the function extracts the probability map which is the set of pixel-specific p-values. The rasterToContour() function takes this raster and creates contour lines with the specified levels corresponding to a 95% tolerance contour. Finally, the use of the return() tells what should be returned when the function is called
### make prepRaster() function
prepRaster <- function(x){
rr <- raster(x$rr,
crs = "+init=epsg:5070")
p_raster <- raster(x$P,
crs = "+init=epsg:5070")
plines <- rasterToContour(p_raster, levels = c(0.025, 0.975))
return(list(rr=rr,plines=plines))
}
Write a function for producing a tmap() panel
### make map () function to create panel maps
make_map <- function(x, bw){
mtitle <- paste('IMR - ', bw, ' smooth', sep = '')
tm_shape(x$rr) +
tm_raster(palette = '-RdYlGn',
style = 'cont',
breaks = c(0.1, 0.6, 0.9, 1.1, 2, 4.9),
midpoint = NA,
title = 'IMR SMR') +
tm_shape(x$plines) +
tm_lines(col = 'level',
legend.col.show = F) +
tm_layout(main.title = mtitle,
legend.format = list(digits = 1))
}
Map the four fixed-bandwith spatial relative risk surfaces using new map function
#convert to raster and extract p-value contours
rr_1000 <- prepRaster(imr1000)
rr_2000 <- prepRaster(imr2000)
rr_4000 <- prepRaster(imr4000)
rr_8000 <- prepRaster(imr8000)
rr_adapt <- prepRaster(imradapt)
#produce map panels
m1000 <- make_map(rr_1000, '1 km')
m2000 <- make_map(rr_2000, '2 km')
m4000 <- make_map(rr_4000, '4 km')
m8000 <- make_map(rr_8000, '8 km')
mapadapt <- make_map(rr_adapt, 'adaptive')
tmap_arrange(m1000, m2000, m4000, m8000, mapadapt, ncol = 2)

Using function GWModel for exploratory spatial data analysis (spatial non-stationary) of ATL dataset
The function gwss() stands for geographically weighted summary statistics, and uses the non- parametric spatial weighting of a kernel density function to compute locally varying descriptive statistics such as the mean, median, standard deviation, correlation, and more. But any statistic can be stationary (constant) or non-stationary (spatially varying).
Mapping the observed values
# First map the 4 variables that are %
tm_shape(atl) +
tm_fill(c('pctNOHS', 'pctPOV', 'pctMOVE', 'pctOWNER_OCC'),
style = 'quantile') +
tm_borders()

The Index of Concentration at the Extremes (ICE) ranges from -1 to +1. A value of −1 occurs where everyone in the tract is poor; a value of +1 occurs in tracts where everyone is affluent; a value of 0 suggests that either there is a balance of affluence and poverty, or alternatively that everyone is ‘middle income’. Therefore it makes sense to map it separately because it will inevitably need a divergent color ramp.
tm_shape(atl) + tm_fill('ICE_INCOME_all',
style = 'quantile',
palette = 'div') +
tm_borders()
## Variable(s) "ICE_INCOME_all" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.

Geographically weighted summary statistics using gwss()
atl.ss <- gwss(atl, vars = c('pctPOV', 'pctNOHS',
'pctMOVE', 'pctOWNER_OCC',
'ICE_INCOME_all'),
adaptive = T,
bw = 35,
quantile = T)
## Warning in proj4string(data): CRS object has comment, which is lost in output
Summary gives you information for the range of smoothed values for each statistic and each variable.All of the estimates of the geographically weighted mean end with _LM which stands for local mean. Similarly the estimates of geographically weighted standard deviation end with _LSD for local SD.
Mapping gwas results
names(atl.ss)
## [1] "SDF" "vars" "kernel" "adaptive" "bw" "p"
## [7] "theta" "longlat" "DM.given" "sp.given" "quantile"
summary(atl.ss)
## Length Class Mode
## SDF 345 SpatialPolygonsDataFrame S4
## vars 5 -none- character
## kernel 1 -none- character
## adaptive 1 -none- logical
## bw 1 -none- numeric
## p 1 -none- numeric
## theta 1 -none- numeric
## longlat 1 -none- logical
## DM.given 1 -none- logical
## sp.given 1 -none- logical
## quantile 1 -none- logical
summary(atl.ss$SDF)
## Object of class SpatialPolygonsDataFrame
## Coordinates:
## min max
## x 1026366.3 1098719
## y -386115.2 -304767
## Is projected: TRUE
## proj4string :
## [+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0
## +datum=NAD83 +units=m +no_defs]
## Data attributes:
## pctPOV_LM pctNOHS_LM pctMOVE_LM pctOWNER_OCC_LM
## Min. :0.05065 Min. :0.01134 Min. :0.1292 Min. :0.2142
## 1st Qu.:0.11977 1st Qu.:0.03180 1st Qu.:0.1666 1st Qu.:0.4210
## Median :0.18084 Median :0.05198 Median :0.1857 Median :0.4869
## Mean :0.19734 Mean :0.05741 Mean :0.1957 Mean :0.4891
## 3rd Qu.:0.27087 3rd Qu.:0.07904 3rd Qu.:0.2159 3rd Qu.:0.5683
## Max. :0.39470 Max. :0.13690 Max. :0.3356 Max. :0.7545
## ICE_INCOME_all_LM pctPOV_LSD pctNOHS_LSD pctMOVE_LSD
## Min. :-0.363659 Min. :0.03047 Min. :0.01302 Min. :0.04340
## 1st Qu.:-0.161687 1st Qu.:0.08096 1st Qu.:0.02774 1st Qu.:0.06214
## Median :-0.010174 Median :0.09503 Median :0.03853 Median :0.07097
## Mean :-0.003091 Mean :0.10274 Mean :0.04617 Mean :0.07694
## 3rd Qu.: 0.180475 3rd Qu.:0.12299 3rd Qu.:0.05589 3rd Qu.:0.08552
## Max. : 0.392141 Max. :0.19999 Max. :0.15118 Max. :0.15943
## pctOWNER_OCC_LSD ICE_INCOME_all_LSD pctPOV_LVar pctNOHS_LVar
## Min. :0.1285 Min. :0.09873 Min. :0.0009282 Min. :0.0001695
## 1st Qu.:0.1711 1st Qu.:0.15379 1st Qu.:0.0065551 1st Qu.:0.0007695
## Median :0.1930 Median :0.18624 Median :0.0090312 Median :0.0014845
## Mean :0.2030 Mean :0.18344 Mean :0.0116555 Mean :0.0028788
## 3rd Qu.:0.2360 3rd Qu.:0.21418 3rd Qu.:0.0151262 3rd Qu.:0.0031239
## Max. :0.3127 Max. :0.29488 Max. :0.0399940 Max. :0.0228544
## pctMOVE_LVar pctOWNER_OCC_LVar ICE_INCOME_all_LVar pctPOV_LSKe
## Min. :0.001884 Min. :0.01650 Min. :0.009747 Min. :-0.6030
## 1st Qu.:0.003862 1st Qu.:0.02929 1st Qu.:0.023652 1st Qu.: 0.4015
## Median :0.005037 Median :0.03727 Median :0.034685 Median : 0.8876
## Mean :0.006367 Mean :0.04293 Mean :0.035527 Mean : 1.0405
## 3rd Qu.:0.007314 3rd Qu.:0.05572 3rd Qu.:0.045875 3rd Qu.: 1.5071
## Max. :0.025418 Max. :0.09779 Max. :0.086957 Max. : 4.5718
## pctNOHS_LSKe pctMOVE_LSKe pctOWNER_OCC_LSKe ICE_INCOME_all_LSKe
## Min. :-0.2635 Min. :-0.4316 Min. :-1.2109 Min. :-1.0305
## 1st Qu.: 0.8059 1st Qu.: 0.2789 1st Qu.:-0.4108 1st Qu.:-0.1805
## Median : 1.3878 Median : 0.5831 Median :-0.1053 Median : 0.1766
## Mean : 1.7775 Mean : 0.6647 Mean :-0.1088 Mean : 0.1911
## 3rd Qu.: 2.4145 3rd Qu.: 0.9743 3rd Qu.: 0.1741 3rd Qu.: 0.5539
## Max. : 7.7612 Max. : 2.6409 Max. : 1.1014 Max. : 1.8477
## pctPOV_LCV pctNOHS_LCV pctMOVE_LCV pctOWNER_OCC_LCV
## Min. :0.2128 Min. :0.3199 Min. :0.2346 Min. :0.2011
## 1st Qu.:0.4155 1st Qu.:0.4924 1st Qu.:0.3534 1st Qu.:0.3505
## Median :0.5774 Median :0.8506 Median :0.3874 Median :0.4275
## Mean :0.5950 Mean :0.9173 Mean :0.3945 Mean :0.4331
## 3rd Qu.:0.7349 3rd Qu.:1.3048 3rd Qu.:0.4300 3rd Qu.:0.5156
## Max. :1.1784 Max. :2.2432 Max. :0.6638 Max. :0.7493
## ICE_INCOME_all_LCV Cov_pctPOV.pctNOHS Cov_pctPOV.pctMOVE
## Min. :-127.7859 Min. :-0.0022931 Min. :-0.0029894
## 1st Qu.: -1.0292 1st Qu.: 0.0009012 1st Qu.: 0.0009079
## Median : -0.3521 Median : 0.0018075 Median : 0.0022298
## Mean : -0.4957 Mean : 0.0027749 Mean : 0.0027372
## 3rd Qu.: 1.0390 3rd Qu.: 0.0033044 3rd Qu.: 0.0044672
## Max. : 94.7817 Max. : 0.0180077 Max. : 0.0098743
## Cov_pctPOV.pctOWNER_OCC Cov_pctPOV.ICE_INCOME_all Cov_pctNOHS.pctMOVE
## Min. :-0.039120 Min. :-0.054587 Min. :-6.067e-03
## 1st Qu.:-0.020229 1st Qu.:-0.021778 1st Qu.: 3.601e-05
## Median :-0.014517 Median :-0.014607 Median : 3.318e-04
## Mean :-0.015524 Mean :-0.016514 Mean : 3.415e-04
## 3rd Qu.:-0.009564 3rd Qu.:-0.009578 3rd Qu.: 7.484e-04
## Max. :-0.001562 Max. :-0.002345 Max. : 3.823e-03
## Cov_pctNOHS.pctOWNER_OCC Cov_pctNOHS.ICE_INCOME_all Cov_pctMOVE.pctOWNER_OCC
## Min. :-0.028576 Min. :-0.027722 Min. :-0.026759
## 1st Qu.:-0.005198 1st Qu.:-0.005411 1st Qu.:-0.012237
## Median :-0.002497 Median :-0.002790 Median :-0.008393
## Mean :-0.004441 Mean :-0.004505 Mean :-0.008884
## 3rd Qu.:-0.001219 3rd Qu.:-0.001574 3rd Qu.:-0.005566
## Max. : 0.006500 Max. : 0.003936 Max. : 0.005566
## Cov_pctMOVE.ICE_INCOME_all Cov_pctOWNER_OCC.ICE_INCOME_all Corr_pctPOV.pctNOHS
## Min. :-0.017867 Min. :0.007196 Min. :-0.3170
## 1st Qu.:-0.008468 1st Qu.:0.020528 1st Qu.: 0.3302
## Median :-0.005404 Median :0.031021 Median : 0.5284
## Mean :-0.004998 Mean :0.032966 Mean : 0.5006
## 3rd Qu.:-0.002219 3rd Qu.:0.044465 3rd Qu.: 0.7092
## Max. : 0.014909 Max. :0.068909 Max. : 0.9313
## Corr_pctPOV.pctMOVE Corr_pctPOV.pctOWNER_OCC Corr_pctPOV.ICE_INCOME_all
## Min. :-0.3313 Min. :-0.9260 Min. :-0.9212
## 1st Qu.: 0.1577 1st Qu.:-0.7682 1st Qu.:-0.8552
## Median : 0.3367 Median :-0.7020 Median :-0.8058
## Mean : 0.3220 Mean :-0.6828 Mean :-0.7813
## 3rd Qu.: 0.5194 3rd Qu.:-0.6048 3rd Qu.:-0.7381
## Max. : 0.7694 Max. :-0.3121 Max. :-0.2913
## Corr_pctNOHS.pctMOVE Corr_pctNOHS.pctOWNER_OCC Corr_pctNOHS.ICE_INCOME_all
## Min. :-0.555197 Min. :-0.7423 Min. :-0.7935
## 1st Qu.: 0.009498 1st Qu.:-0.4959 1st Qu.:-0.5456
## Median : 0.134766 Median :-0.3802 Median :-0.4470
## Mean : 0.145947 Mean :-0.3462 Mean :-0.4219
## 3rd Qu.: 0.300808 3rd Qu.:-0.2283 3rd Qu.:-0.3231
## Max. : 0.683280 Max. : 0.4213 Max. : 0.3913
## Corr_pctMOVE.pctOWNER_OCC Corr_pctMOVE.ICE_INCOME_all
## Min. :-0.8912 Min. :-0.8172
## 1st Qu.:-0.7469 1st Qu.:-0.5948
## Median :-0.6134 Median :-0.4580
## Mean :-0.5470 Mean :-0.3573
## 3rd Qu.:-0.4331 3rd Qu.:-0.1890
## Max. : 0.2776 Max. : 0.5593
## Corr_pctOWNER_OCC.ICE_INCOME_all Spearman_rho_pctPOV.pctNOHS
## Min. :0.4761 Min. :-0.3137
## 1st Qu.:0.7364 1st Qu.: 0.3999
## Median :0.8139 Median : 0.5570
## Mean :0.7983 Mean : 0.5370
## 3rd Qu.:0.8681 3rd Qu.: 0.7045
## Max. :0.9618 Max. : 0.9311
## Spearman_rho_pctPOV.pctMOVE Spearman_rho_pctPOV.pctOWNER_OCC
## Min. :-0.3390 Min. :-0.9327
## 1st Qu.: 0.2155 1st Qu.:-0.7724
## Median : 0.3742 Median :-0.6998
## Mean : 0.3469 Mean :-0.6871
## 3rd Qu.: 0.5295 3rd Qu.:-0.6212
## Max. : 0.7608 Max. :-0.3108
## Spearman_rho_pctPOV.ICE_INCOME_all Spearman_rho_pctNOHS.pctMOVE
## Min. :-0.9457 Min. :-0.40721
## 1st Qu.:-0.8764 1st Qu.: 0.05727
## Median :-0.8393 Median : 0.17412
## Mean :-0.8093 Mean : 0.16972
## 3rd Qu.:-0.7747 3rd Qu.: 0.30311
## Max. :-0.3117 Max. : 0.58534
## Spearman_rho_pctNOHS.pctOWNER_OCC Spearman_rho_pctNOHS.ICE_INCOME_all
## Min. :-0.8288 Min. :-0.8405
## 1st Qu.:-0.5518 1st Qu.:-0.6476
## Median :-0.4284 Median :-0.5250
## Mean :-0.3868 Mean :-0.5064
## 3rd Qu.:-0.2455 3rd Qu.:-0.3814
## Max. : 0.2446 Max. : 0.1699
## Spearman_rho_pctMOVE.pctOWNER_OCC Spearman_rho_pctMOVE.ICE_INCOME_all
## Min. :-0.8851 Min. :-0.8330
## 1st Qu.:-0.7314 1st Qu.:-0.5600
## Median :-0.6195 Median :-0.4284
## Mean :-0.5577 Mean :-0.3450
## 3rd Qu.:-0.4501 3rd Qu.:-0.1736
## Max. : 0.3082 Max. : 0.4005
## Spearman_rho_pctOWNER_OCC.ICE_INCOME_all pctPOV_Median pctNOHS_Median
## Min. :0.4549 Min. :0.04247 Min. :0.003663
## 1st Qu.:0.7123 1st Qu.:0.09031 1st Qu.:0.014632
## Median :0.7973 Median :0.16100 Median :0.042506
## Mean :0.7746 Mean :0.17781 Mean :0.043469
## 3rd Qu.:0.8524 3rd Qu.:0.25192 3rd Qu.:0.068828
## Max. :0.9477 Max. :0.39190 Max. :0.105287
## pctMOVE_Median pctOWNER_OCC_Median ICE_INCOME_all_Median pctPOV_IQR
## Min. :0.09583 Min. :0.1919 Min. :-0.38411 Min. :0.02725
## 1st Qu.:0.15863 1st Qu.:0.3920 1st Qu.:-0.19501 1st Qu.:0.08469
## Median :0.18618 Median :0.4902 Median :-0.02393 Median :0.11507
## Mean :0.18653 Mean :0.4950 Mean :-0.01409 Mean :0.12983
## 3rd Qu.:0.20178 3rd Qu.:0.6023 3rd Qu.: 0.17007 3rd Qu.:0.17100
## Max. :0.31080 Max. :0.8389 Max. : 0.38298 Max. :0.33702
## pctNOHS_IQR pctMOVE_IQR pctOWNER_OCC_IQR ICE_INCOME_all_IQR
## Min. :0.007199 Min. :0.04923 Min. :0.06952 Min. :0.07414
## 1st Qu.:0.023346 1st Qu.:0.07772 1st Qu.:0.24067 1st Qu.:0.18532
## Median :0.036208 Median :0.09437 Median :0.29305 Median :0.23079
## Mean :0.048902 Mean :0.09872 Mean :0.31103 Mean :0.24477
## 3rd Qu.:0.056780 3rd Qu.:0.11432 3rd Qu.:0.36347 3rd Qu.:0.29328
## Max. :0.196135 Max. :0.21948 Max. :0.66631 Max. :0.55967
## pctPOV_QI pctNOHS_QI pctMOVE_QI pctOWNER_OCC_QI
## Min. :-0.84239 Min. :-0.937694 Min. :-0.71527 Min. :-0.7647
## 1st Qu.:-0.35758 1st Qu.:-0.516389 1st Qu.:-0.28486 1st Qu.:-0.1424
## Median :-0.16139 Median :-0.260145 Median :-0.07138 Median : 0.0615
## Mean :-0.12672 Mean :-0.243326 Mean :-0.05734 Mean : 0.0391
## 3rd Qu.: 0.08367 3rd Qu.: 0.008415 3rd Qu.: 0.14546 3rd Qu.: 0.2285
## Max. : 0.74983 Max. : 0.496576 Max. : 0.70005 Max. : 0.7302
## ICE_INCOME_all_QI
## Min. :-0.86704
## 1st Qu.:-0.23682
## Median :-0.03107
## Mean :-0.02864
## 3rd Qu.: 0.16363
## Max. : 0.88027
There are many sub-objects within the main result object. But the first one, always called SDF has class SpatialPolygonsDataFrame. The is basically the sp version of a polygon spatial file. Examining it more closely ee that it has the information we need to make maps (e.g. it is a spatial object with attribute data).
#map geographically-weighted median
tm_shape(atl.ss$SDF) +
tm_fill(c('pctPOV_Median',
'pctNOHS_Median',
'pctMOVE_Median',
'pctOWNER_OCC_Median'),
style = 'quantile') +
tm_borders()

#map geographically-weighted IQR
tm_shape(atl.ss$SDF) +
tm_fill(c('pctPOV_IQR',
'pctNOHS_IQR',
'pctMOVE_IQR',
'pctOWNER_OCC_IQR'),
style = 'quantile') +
tm_borders()

Calculating pseudo p-values for smoothed estimates
Using Monte Carlo permutation testing. The idea with permutation testing, is empirically simulating what the data would look like under the null distribution. Then compare single instance of observed data to the simulated null distribution to describe how unusual the observations are, given what would have been expected due to chance alone.
The function follows a 3-step process. First it will randomly reassign the location of variables of interest n−times (where n specified by user, but typically reasonably large) Second, for each random permutation of the random variable, the summary statistic (e.g. the spatially-weighted local mean of variable) is calculated. Finally, the observed results is compared to the null distribution. If we calculated n = 999 random permutations, then we would have n = 1000 versions of the summarized statistic, including the observed. The pseudo p-value is calculated as the number of times for each spatial unit that the observed value is more extreme than what would be expected under the null. For example if we defined extreme as being something that happens fewer than 5% of the time by random chance alone, then we might classify our observed value as extreme (and thus significant) if our observed value was either less than the lower 2.5% of the null values, or greater than the upper 97.5% of the null values.
p.val <- gwss.montecarlo(atl, vars = c('pctPOV',
'pctMOVE'),
adaptive = T,
bw = 35,
nsim = 499)
## Warning in proj4string(data): CRS object has comment, which is lost in output
summary(p.val)
## pctPOV_LM pctMOVE_LM pctPOV_LSD pctMOVE_LSD
## Min. :0.0020 Min. :0.0020 Min. :0.0040 Min. :0.0020
## 1st Qu.:0.2600 1st Qu.:0.2540 1st Qu.:0.2500 1st Qu.:0.2480
## Median :0.5060 Median :0.4880 Median :0.5080 Median :0.4960
## Mean :0.5007 Mean :0.5009 Mean :0.5008 Mean :0.5005
## 3rd Qu.:0.7560 3rd Qu.:0.7420 3rd Qu.:0.7500 3rd Qu.:0.7500
## Max. :0.9980 Max. :0.9980 Max. :0.9980 Max. :0.9980
## pctPOV_LVar pctMOVE_LVar pctPOV_LSKe pctMOVE_LSKe
## Min. :0.0040 Min. :0.0020 Min. :0.002 Min. :0.0040
## 1st Qu.:0.2500 1st Qu.:0.2480 1st Qu.:0.248 1st Qu.:0.2580
## Median :0.5080 Median :0.4960 Median :0.500 Median :0.4880
## Mean :0.5008 Mean :0.5005 Mean :0.501 Mean :0.5006
## 3rd Qu.:0.7500 3rd Qu.:0.7500 3rd Qu.:0.750 3rd Qu.:0.7480
## Max. :0.9980 Max. :0.9980 Max. :0.998 Max. :0.9980
## pctPOV_LCV pctMOVE_LCV Cov_pctPOV.pctMOVE Corr_pctPOV.pctMOVE
## Min. :0.0020 Min. :0.0040 Min. :0.0060 Min. :0.002
## 1st Qu.:0.2420 1st Qu.:0.2480 1st Qu.:0.2520 1st Qu.:0.248
## Median :0.5080 Median :0.5080 Median :0.5060 Median :0.510
## Mean :0.4997 Mean :0.5011 Mean :0.5001 Mean :0.500
## 3rd Qu.:0.7460 3rd Qu.:0.7440 3rd Qu.:0.7560 3rd Qu.:0.754
## Max. :0.9980 Max. :0.9980 Max. :0.9980 Max. :0.998
## Spearman_rho_pctPOV.pctMOVE
## Min. :0.0020
## 1st Qu.:0.2520
## Median :0.5020
## Mean :0.5007
## 3rd Qu.:0.7480
## Max. :0.9980
Converting into a map to visually inspect the spatial variation in the geographically-weighted mean of pctPOV.
# First, create TRUE/FALSE vectors testing whether column 1 (pctPOV_LM) is extreme # I am using 2 significance levels: 90% and 95%
sig95 <- p.val[, 1] < 0.025 | p.val[, 1] > 0.975
sig90 <- p.val[, 1] < 0.05 | p.val[, 1] > 0.95
# Second create a spatial object that ONLY contains significant tracts
atl.sig95 <- atl[sig95, ] %>%
aggregate(dissolve = T, FUN = mean) # this is sp code to merge adjacent tracts
## Loading required namespace: rgeos
atl.sig90 <- atl[sig90, ] %>%
aggregate(disolve = T, FUN = mean)
tm_shape(atl.ss$SDF) +
tm_fill('pctPOV_LM',
style = 'quantile',
title = 'Local Average % Poverty') +
tm_borders() +
tm_shape(atl.sig90) +
tm_borders(lwd = 2, col = 'blue') +
tm_shape(atl.sig95) +
tm_borders(lwd = 2, col ='red') +
tm_add_legend(type = 'line',
labels = c('p < 0.05', 'p < 0.10'),
col = c('red', 'blue'))

Suggests that there is a great deal of spatial heterogeneity. However, the permutation test suggests that only a few regions in far North Fulton and in West Atlanta have values that are more extreme than expected under an assumption of spatial independence.
Estimating geographically-weighted bivariate statistics
tm_shape(atl.ss$SDF) +
tm_fill(c('Spearman_rho_pctPOV.pctNOHS',
'Spearman_rho_pctPOV.pctMOVE',
'Spearman_rho_pctPOV.ICE_INCOME_all'),
style = 'quantile') +
tm_borders()
## Variable(s) "Spearman_rho_pctPOV.pctNOHS" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
## Variable(s) "Spearman_rho_pctPOV.pctMOVE" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.

Examining whether differences are more extreme than expected under a null hypothesis of spatial independence and spatial stationarity.
p.val <- gwss.montecarlo(atl,
vars = c('pctPOV',
'pctMOVE'),
adaptive = T,
bw = 35,
nsim = 499)
## Warning in proj4string(data): CRS object has comment, which is lost in output
# First, create TRUE/FALSE vectors testing whether column 1 (pctPOV_LM) is extreme # I am using 2 significance levels: 90% and 95%
sig95 <- p.val[, 13] < 0.025 | p.val[, 13] > 0.975
sig90 <- p.val[, 13] < 0.05 | p.val[, 13] > 0.95
# Second create a spatial object that ONLY contains significant tracts
atl.sig95 <- atl[sig95, ] %>%
aggregate(dissolve = T, FUN = mean)
atl.sig90 <- atl[sig90, ] %>%
aggregate(disolve = T, FUN = mean)
Map correlation between pctPOV and pctMOVE alon with the significance test
tm_shape(atl.ss$SDF) +
tm_fill('Spearman_rho_pctPOV.pctMOVE',
style = 'quantile',
title = 'Local correlation Poverty\n&
Residential instability') +
tm_borders() +
tm_shape(atl.sig90) +
tm_borders(lwd = 2, col = 'blue') +
tm_shape(atl.sig95) +
tm_borders(lwd = 2, col ='red') +
tm_add_legend(type = 'line',
labels = c('p < 0.05', 'p < 0.10'),
col = c('red', 'blue'))
## Variable(s) "Spearman_rho_pctPOV.pctMOVE" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
